Coping with Data Scarcity in Deep Learning and Applications for Social Good
Recent years have seen an extremely fast evolution of the Computer Vision and
Machine Learning fields: several application domains benefit from the newly developed
technologies, and industries are investing a growing amount of money in Artificial Intelligence.
Convolutional Neural Networks and Deep Learning have substantially contributed to the rise
and diffusion of AI-based solutions, creating the potential for many disruptive new businesses.
The effectiveness of Deep Learning models is grounded in the availability of huge
amounts of training data. Unfortunately, data collection and labeling is an extremely expensive
task in terms of both time and cost; moreover, it frequently requires the collaboration of
domain experts.
In the first part of the thesis, I will investigate some methods for reducing the cost
of data acquisition for Deep Learning applications in the relatively constrained industrial
scenarios related to visual inspection. I will primarily assess the effectiveness of Deep Neural
Networks in comparison with several classical Machine Learning algorithms requiring a
smaller amount of data to be trained. Next, I will introduce a hardware-based data
augmentation approach that yields a considerable performance boost by taking advantage of
a novel illumination setup designed for this purpose. Finally, I will investigate the situation in
which acquiring a sufficient number of training samples is not possible, in particular the most
extreme situation: zero-shot learning (ZSL), which is the problem of multi-class classification
when no training data is available for some of the classes. Visual features designed for image
classification and trained offline have been shown to be useful for ZSL to generalize towards
classes not seen during training. Nevertheless, I will show that recognition performance
on unseen classes can be sharply improved by jointly learning an ad hoc semantic embedding
(the pre-defined list of present and absent attributes that represents a class) and the visual
features, so as to increase the correlation between the two geometric spaces and ease the
metric learning process for ZSL.
In the second part of the thesis, I will present some successful applications of state-of-the-
art Computer Vision, Data Analysis and Artificial Intelligence methods. I will illustrate
some solutions developed during the 2020 Coronavirus Pandemic for controlling the disease
evolution and for reducing virus spreading. I will describe the first publicly available
dataset for the analysis of face-touching behavior that we annotated and distributed, and
I will illustrate an extensive evaluation of several computer vision methods applied to the
produced dataset. Moreover, I will describe the privacy-preserving solution we developed
for estimating the “Social Distance” and its violations, given a single uncalibrated image
in unconstrained scenarios. I will conclude the thesis with a Computer Vision solution
developed in collaboration with the Egyptian Museum of Turin for digitally unwrapping
mummies by analyzing their CT scans, to support archaeologists during mummy analysis
while avoiding the devastating and irreversible process of physically unwrapping the bandages
to remove amulets and jewels from the body.
Weakly Supervised Geodesic Segmentation of Egyptian Mummy CT Scans
In this paper, we tackle the task of automatically analyzing 3D volumetric
scans obtained from computed tomography (CT) devices. In particular, we address
a task for which data is very limited: the segmentation of CT scans of ancient
Egyptian mummies. We aim at digitally unwrapping the mummy and
identifying different segments such as body, bandages and jewelry. The problem is
complex because of the lack of annotated data for the different semantic
regions to segment, thus discouraging the use of strongly supervised
approaches. We, therefore, propose a weakly supervised and efficient
interactive segmentation method to solve this challenging problem. After
segmenting the wrapped mummy from its exterior region using histogram analysis
and template matching, we first design a voxel distance measure to find an
approximate solution for the body and bandage segments. Here, we use geodesic
distances, since both voxel features and the spatial relationships among voxels are
incorporated in this measure. Next, we refine the solution using a GrabCut-based
segmentation together with a tracking method on the slices of the scan
that assigns labels to different regions in the volume, using limited
supervision in the form of scribbles drawn by the user. The efficiency of the
proposed method is demonstrated using visualizations and validated through
quantitative measures and qualitative unwrapping of the mummy.
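The geodesic measure described above can be sketched on a 2D slice as Dijkstra's algorithm on the pixel grid, where each step costs its spatial length plus a penalty on the intensity change, so distances grow slowly inside homogeneous tissue and jump at boundaries. This is a simplified stand-in for the paper's 3D voxel measure; the image, seed, and `gamma` weight are illustrative:

```python
import heapq
import numpy as np

def geodesic_distance(img, seeds, gamma=10.0):
    """Geodesic distance map on a 2D slice: each 4-neighbour step costs
    1 (spatial) + gamma * |intensity difference|, combining spatial
    relationships and voxel features in a single measure."""
    h, w = img.shape
    dist = np.full((h, w), np.inf)
    heap = []
    for (r, c) in seeds:
        dist[r, c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:                                  # standard Dijkstra loop
        d, r, c = heapq.heappop(heap)
        if d > dist[r, c]:
            continue                             # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                step = 1.0 + gamma * abs(float(img[nr, nc]) - float(img[r, c]))
                if d + step < dist[nr, nc]:
                    dist[nr, nc] = d + step
                    heapq.heappush(heap, (d + step, nr, nc))
    return dist

# Toy "scan": a bright body region (1.0) inside darker bandages (0.2).
img = np.full((8, 8), 0.2)
img[2:6, 2:6] = 1.0
d = geodesic_distance(img, seeds=[(4, 4)])   # seed scribble inside the body
# Distances stay small inside the bright region and jump across its edge.
print(d[4, 5], d[0, 0])
```

With two seed sets (body scribbles and bandage scribbles), each voxel can then be assigned the label of the nearer set, which is the spirit of the approximate solution the paper refines with GrabCut.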
Single Image Human Proxemics Estimation for Visual Social Distancing
In this work, we address the problem of estimating the so-called "Social
Distancing" given a single uncalibrated image in unconstrained scenarios. Our
approach proposes a semi-automatic solution to approximate the homography
matrix between the scene ground and image plane. With the estimated homography,
we then leverage an off-the-shelf pose detector to detect body poses on the
image and to reason upon their inter-personal distances using the length of
their body-parts. Inter-personal distances are further locally inspected to
detect possible violations of the social distancing rules. We validate our
proposed method quantitatively and qualitatively against baselines on public
domain datasets for which we provided ground truth on inter-personal distances.
Besides, we demonstrate the application of our method deployed in a real
testing scenario where statistics on the inter-personal distances are currently
used to improve safety in a critical environment. Comment: Paper accepted at the WACV 2021 conference.
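The final step of the pipeline described above can be sketched as follows: once a homography between the image plane and the scene ground is available, pixel positions of detected people are mapped to metric ground coordinates and their pairwise distances checked against a threshold. The homography, pixel coordinates, and scale below are purely illustrative:

```python
import numpy as np

def to_ground(H, pts):
    """Map image pixel coordinates (N x 2) to metric ground-plane
    coordinates using a 3x3 homography H (assumed already estimated,
    e.g. semi-automatically from scene cues as in the paper)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])   # homogeneous coords
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]              # perspective divide

# Toy homography: pure scaling, 100 pixels per metre (illustrative only).
H = np.diag([0.01, 0.01, 1.0])

feet = np.array([[500.0, 800.0],     # person A's feet midpoint (pixels)
                 [650.0, 800.0]])    # person B's feet midpoint (pixels)
ga, gb = to_ground(H, feet)
dist_m = float(np.linalg.norm(ga - gb))
print(f"inter-personal distance: {dist_m:.2f} m")   # 1.50 m here
violation = dist_m < 1.0                            # e.g. a 1 m rule
```

In the actual method the feet positions come from an off-the-shelf pose detector, and body-part lengths help resolve the metric scale.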
Sources of health financing and health outcomes: A panel data analysis
Ministry of Education, Singapore under its Academic Research Funding Tier
Time Warping Under Dynamic Constraints
Action and event recognition from video require comparing temporal sequences of images, or of intermediate representations derived from them. Such a comparison should be insensitive to intrinsic temporal variations within the same class – for instance the speed of execution of a particular gesture – and at the same time retain discriminative power to enable classifying different actions. In this paper, we propose a technique to compare temporal sequences that accounts for dynamic constraints implicit in the data generation process. Our technique is more flexible than those previously used for quasi-periodic actions such as walking gaits, but more discriminative than others based on dynamic time warping that do not satisfy dynamic constraints. We illustrate our approach on public datasets including stationary and non-stationary actions, using both motion-capture and image data.
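To make the idea of constraining a warp concrete, here is classical dynamic time warping with a Sakoe-Chiba band, which restricts the warping path to a corridor around the diagonal. This is the textbook constrained-DTW baseline, not the paper's more general dynamic constraints; the sequences are toy data:

```python
import numpy as np

def dtw_banded(x, y, band=3):
    """DTW distance with a Sakoe-Chiba band: the warping path may deviate
    at most `band` steps from the diagonal, ruling out degenerate warps
    while still absorbing moderate speed changes."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        lo, hi = max(1, i - band), min(m, i + band)
        for j in range(lo, hi + 1):          # only cells inside the band
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# The same "gesture" executed slowly and quickly aligns at zero cost with a
# wide enough band, but a very tight band forbids the needed warp.
slow = np.array([0., 0., 1., 1., 2., 2., 3., 3.])
fast = np.array([0., 1., 2., 3., 3., 3., 3., 3.])
print(dtw_banded(slow, fast))           # band=3 absorbs the speed change
print(dtw_banded(slow, fast, band=1))   # tighter band leaves residual cost
```

The paper's contribution is to replace such a fixed geometric corridor with constraints derived from the dynamics of the data-generating process.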
End-to-end pairwise human proxemics from uncalibrated single images
In this work, we address the ill-posed problem of estimating pairwise metric distances between people using only a single uncalibrated image. We propose an end-to-end model, DeepProx, that takes as inputs two skeletal joints as a set of 2D image coordinates and outputs the metric distance between them. We show that increased performance is achieved by a geometrical loss over simplified camera parameters provided at training time. Further, DeepProx achieves remarkable generalisation over novel viewpoints through domain generalisation techniques. We validate our proposed method quantitatively and qualitatively against baselines on public datasets for which we provided ground truth on interpersonal distances.
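The geometric relationship such a model can exploit at training time is the pinhole back-projection: a person of roughly known height whose skeleton spans p pixels sits at depth z = f·H/p, and metric positions follow by back-projecting the pixel coordinates. The sketch below is only this classical geometry with assumed values for f and H; DeepProx learns the mapping end-to-end rather than computing it in closed form:

```python
import numpy as np

f = 1000.0   # focal length in pixels (assumed "simplified camera parameter")
H = 1.7      # assumed average person height in metres

def back_project(u, v, p):
    """Back-project a pixel position (u, v) to a 3D point, using the
    apparent skeleton height p (pixels) to estimate depth."""
    z = f * H / p                      # pinhole relation: z = f * H / p
    return np.array([u * z / f, v * z / f, z])

a = back_project(100.0, 0.0, 400.0)    # person A: +100 px offset, 400 px tall
b = back_project(-100.0, 0.0, 400.0)   # person B: mirrored position
print(np.linalg.norm(a - b))           # metric pairwise distance
```

A geometric loss can then penalize predicted distances that disagree with this back-projected distance when simplified camera parameters are available during training.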
Functional Polymeric Membranes with Antioxidant Properties for the Colorimetric Detection of Amines
Herein, the ability of highly porous colorimetric indicators to sense volatile and biogenic amine vapors in real time is presented. Curcumin-loaded polycaprolactone porous fiber mats are exposed to various concentrations of off-flavor compounds such as the volatile amine trimethylamine, and the biogenic amines cadaverine, putrescine, spermidine, and histamine, in order to investigate their colorimetric response. CIELAB color space analysis demonstrates that the porous fiber mats can detect the amine vapors, showing a distinct color change in the presence of down to 2.1 ppm of trimethylamine and ca. 11.0 ppm of biogenic amines, surpassing the limit of visual perception in just a few seconds. Moreover, the color changes are reversible either spontaneously, in the case of the volatile amines, or in an assisted way, through interactions with an acidic environment, in the case of the biogenic amines, enabling the use of the same indicator several times. Finally, yet importantly, the strong antioxidant activity of the curcumin-loaded fibers is successfully demonstrated through DPPH• and ABTS• radical scavenging assays. Through such a detailed study, we prove that the developed porous mats can be successfully established as a reusable smart system in applications where the rapid detection of alkaline vapors and/or the antioxidant activity are essential, such as food packaging, biomedicine, and environmental protection.